61 research outputs found

    Understanding Mobile Data Demand regarding Mobility: The report for mid-term thesis evaluation

    Smartphones are supposedly the fastest-spreading technology in human history. Global mobile data traffic grew by 74% in 2015 and is predicted to increase eightfold by 2020. Understanding subscribers' mobile data demand is therefore of great significance for solutions that manage the increasing data traffic and improve the quality of communication services. A core question in understanding mobile data demand is the degree to which mobile data traffic is predictable. We explore the predictability of data volume for individuals. Specifically, our goal is to determine the maximum probability of correctly forecasting the data volume of each subscriber. To this end, we mine a large-scale mobile dataset containing both voice traffic and data traffic, construct a dataset of time series of data volume, and explore the upper bound of predictability hidden in the time series. We find an overall predictability of more than 90% hidden in individuals' time series of data volume.

    STGIC: a graph and image convolution-based method for spatial transcriptomic clustering

    Spatial transcriptomic (ST) clustering uses spatial and transcription information to group spots that are spatially coherent and transcriptionally similar into the same spatial domain. Graph convolution networks (GCN) and graph attention networks (GAT), fed with an adjacency matrix derived from spatial coordinates and a feature matrix derived from transcription profiles, are often used to solve the problem. Our proposed method, STGIC (spatial transcriptomic clustering with graph and image convolution), uses an adaptive graph convolution (AGC) to obtain high-quality pseudo-labels and then applies a dilated convolution framework (DCF) to a virtual image converted from the gene expression information and spatial coordinates of spots. The dilation rates and kernel sizes are set appropriately, and the updating of kernel weights is constrained by the spatial distance from each element's position to the kernel center, so that feature extraction for each spot is better guided by the spatial distance to neighboring spots. The training objective of DCF combines self-supervision realized by KL-divergence, a spatial continuity loss, and cross entropy calculated among spots with high-confidence pseudo-labels. STGIC attains state-of-the-art (SOTA) clustering performance on the benchmark dataset of human dorsolateral prefrontal cortex (DLPFC). It is also capable of depicting fine structures of other tissues from other species and of guiding the identification of marker genes. In addition, STGIC is extensible to Stereo-seq data with high spatial resolution.
    Comment: A major revision produced the current version: 1. The writing style has been thoroughly changed. 2. Four more datasets have been added. 3. Contrastive learning has been removed since it makes no significant difference to performance. 4. Two more authors are added.

    Takeaways in Large-scale Human Mobility Data Mining

    Employing mobile devices to perform data analytics is a typical fog computing application that utilizes the intelligence at the edge of networks. Such an application relies on knowledge of the mobility of mobile devices and their users, e.g., to deploy computation tasks efficiently at the edge. This paper surveys the literature on the mobility-related utilization of operator-collected CDR (charging data records), the most significant proxy for large-scale human mobility studies. We provide an introductory guide to the preliminaries of CDR data. It reveals original issues regarding CDR-based mobility feature computation and applications at the edge. Our survey supports the use of mobile devices in both human mobility investigation and fog computing.

    Towards an Adaptive Completion of Sparse Call Detail Records for Mobility Analysis

    Call Detail Records (CDRs) are a primary source of whereabouts in the study of multiple mobility-related aspects. However, the spatiotemporal sparsity of CDRs often limits their utility in terms of the dependability of results. In this paper, driven by real-world data across a large population, we propose two approaches for completing CDRs adaptively, to reduce the sparsity and mitigate the problems it raises. Owing to high-precision sampling, the comparative evaluation shows that our approaches outperform the legacy solution in the literature in terms of the combination of accuracy and temporal coverage. We also reveal the important factors for completing sparse CDR data, which sheds light on the design of similar approaches.

    Spatio-Temporal Completion of Call Detail Records for Human Mobility Analysis

    Call Detail Records (CDRs) have been widely used in the last decades for studying different aspects of human mobility. The accuracy of CDRs strongly depends on the user-network interaction frequency: hence, the temporal and spatial sparsity that typically characterize CDRs can introduce a bias in the mobility analysis. In this paper, we evaluate the bias induced by the use of CDRs for inferring important locations of mobile subscribers, as well as their complete trajectories. In addition, we propose a novel technique for estimating real human trajectories from sparse CDRs. Compared to previous solutions in the literature, our proposed technique reduces the error between real and estimated human trajectories and at the same time shortens the temporal period where users' locations remain undefined.

    On the Quest for Representative Behavioral Datasets: Mobility and Content Demand

    Mobile datasets are widely used as firsthand sources for human mobility research. These datasets are often incomplete or have heterogeneous spatiotemporal resolutions, e.g., a dataset is often aggregated or lacks fields. In many cases, a reliable dataset in human mobility research comes from sampling or merging original datasets, which is a challenging task. In this paper, we present our experience in creating a reliable dataset describing mobile data traffic from the individual's spatiotemporal perspective. We focus on individuals with enough geographical information and merge their call records from one dataset with the data traffic records extracted from another dataset. Based on this dataset, we analyze user demand for mobile data traffic in terms of spatial and temporal behaviors. For each subscriber, sessions are placed in a 3-dimensional space of space, time, and volume and are clustered by applying DBSCAN. Characteristics are revealed by statistical analysis of the clusters. Subscribers are also categorized according to their clusters.
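
The clustering step described above can be illustrated with a minimal, pure-Python DBSCAN. This is only a sketch under stated assumptions, not the authors' implementation: each session is a hypothetical `(location, time, volume)` triple whose values are assumed to be already normalized to comparable scales, and `eps` and `min_pts` are illustrative parameter choices.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Brute-force neighborhood query (includes the point itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nb)
    return labels


# Hypothetical, pre-normalized sessions: (location, time, volume).
sessions = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (0.0, 0.1, 0.1),  # one dense group
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 4.9),  # another dense group
    (10.0, 0.0, 3.0),                                   # isolated session
]
labels = dbscan(sessions, eps=0.5, min_pts=3)
print(labels)
```

The brute-force neighborhood query is quadratic in the number of sessions; for per-subscriber session counts that is usually acceptable, while larger inputs would call for a spatial index.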

    Filling the Gaps: On the Completion of Sparse Call Detail Records for Mobility Analysis

    Call Detail Records (CDRs) have been widely used in the last decades for studying different aspects of human mobility. The accuracy of CDRs strongly depends on the user-network interaction frequency: hence, the temporal and spatial sparsity that typically characterize CDRs can introduce a bias in the mobility analysis. In this paper, we evaluate the bias induced by the use of CDRs for inferring important locations of mobile subscribers, as well as their complete trajectories. In addition, we propose a novel technique for estimating real human trajectories from sparse CDRs. Compared to previous solutions in the literature, our proposed technique reduces the error between real and estimated human trajectories and at the same time shortens the temporal period where users' locations remain undefined.

    Relevance of Context for the Temporal Completion of Call Detail Record

    Call Detail Records (CDRs) are an important source of information in the study of different aspects of human mobility. However, their utility is often limited by spatio-temporal sparsity. In this paper, we first evaluate the effectiveness of CDRs in measuring relevant mobility features. We then investigate whether the information on users' instantaneous whereabouts provided by CDRs enables us to estimate positions over longer time spans. Our results confirm that CDRs ensure a good estimation of radii of gyration and important locations, yet they lose some location information. Most importantly, we show that temporal completion of CDRs is straightforward and efficient: because users remain fairly static before and after mobile communication activities, the majority of users' locations over time can be accurately inferred from CDRs. Finally, we observe the importance of the user's context, i.e., the size of the current network cell, for the quality of the CDR temporal completion.
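
The idea of temporal completion stated in the abstract above (users stay roughly where they were shortly before and after a communication event) can be sketched as follows. This is a hypothetical illustration, not the paper's method: the function name `complete_cdr`, the fixed `max_gap` window, and the nearest-event rule are all assumptions.

```python
import bisect


def complete_cdr(events, slots, max_gap):
    """Assign a cell to each time slot from the nearest CDR event.

    events  -- list of (timestamp, cell_id), sorted by timestamp
    slots   -- timestamps at which a location is wanted
    max_gap -- largest time distance for which the user is assumed static
    Returns one cell_id per slot, or None where the location stays undefined.
    """
    times = [t for t, _ in events]
    completed = []
    for s in slots:
        i = bisect.bisect_left(times, s)
        candidates = []
        if i < len(events):
            candidates.append(events[i])      # first event at or after the slot
        if i > 0:
            candidates.append(events[i - 1])  # last event before the slot
        best = min(candidates, key=lambda e: abs(e[0] - s), default=None)
        if best is not None and abs(best[0] - s) <= max_gap:
            completed.append(best[1])
        else:
            completed.append(None)
    return completed


# Hypothetical events (minutes since midnight, cell id) and query slots.
events = [(10, "A"), (60, "B")]
slots = [0, 15, 30, 40, 55, 120]
print(complete_cdr(events, slots, max_gap=15))
```

A fixed window is the simplest choice; the abstract's remark on cell size suggests that a context-dependent window could be a natural refinement.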

    Spatio-Temporal Predictability of Cellular Data Traffic

    Knowledge of the upper bounds of mobile data traffic predictors provides valuable insights into human behavior, new opportunities to reshape mobile network management and services, and guidance for researchers designing effective prediction algorithms. In this paper, we leverage two large-scale real-world datasets collected by a major mobile carrier in a Latin American country to investigate the limits of predictability of cellular data traffic demands generated by individual users. Using information theory tools, we measure the maximum predictability that any algorithm can potentially achieve. We first focus on the predictability of mobile traffic consumption patterns in isolation. Our results show that it is theoretically possible to anticipate the individual demand with a typical accuracy of 85% and reveal that this percentage is consistent across all user types. Despite the heterogeneity of users, we also find no significant variability in predictability when considering demographic factors or different mobility or mobile service usage. Then, we analyze the joint predictability of the traffic demands and mobility patterns. We find that the two dimensions are correlated, which improves the predictability upper bound to 90% on average.
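
The abstract above does not name the specific information-theoretic tools. A formulation commonly used in the predictability literature derives the bound from Fano's inequality: given a sequence entropy S (in bits) over N distinct states, the maximum predictability Π solves S = H_b(Π) + (1 − Π) log2(N − 1), where H_b is the binary entropy. The sketch below solves this equation by bisection; treating it as the approach used here is an assumption, and `max_predictability` is a hypothetical helper name.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def max_predictability(entropy_bits, n_states, tol=1e-9):
    """Solve S = H_b(Pi) + (1 - Pi) * log2(N - 1) for Pi in [1/N, 1]."""
    if n_states < 2:
        return 1.0  # a single state is always predictable

    def fano(pi):
        return binary_entropy(pi) + (1 - pi) * math.log2(n_states - 1)

    # Clamp S to the feasible range [0, log2 N].
    s = min(max(entropy_bits, 0.0), math.log2(n_states))
    # fano() decreases from log2(N) at Pi = 1/N to 0 at Pi = 1, so bisect.
    lo, hi = 1.0 / n_states, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative values only: entropy of 1 bit over 10 traffic-volume levels.
print(max_predictability(1.0, 10))
```

The entropy fed into this function would itself have to be estimated from each user's time series; that estimation step is outside the scope of this sketch.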

    JSSDR: Joint-Sparse Sensory Data Recovery in Wireless Sensor Networks

    Data loss is ubiquitous in wireless sensor networks (WSNs), mainly due to unreliable wireless transmission, which results in incomplete sensory data sets. However, the completeness of a data set directly determines its availability and usefulness. Thus, sensory data recovery is an indispensable operation against the data loss problem. Existing solutions cannot achieve satisfactory accuracy due to special loss patterns and high loss rates in WSNs. In this work, we propose a novel sensory data recovery algorithm which exploits the spatial and temporal joint-sparse feature. Firstly, by mining two real datasets, namely the Intel Indoor project and the GreenOrbs project, we find that: (1) for one attribute, sensory readings at nearby nodes exhibit inter-node correlation; (2) for two attributes, sensory readings at the same node exhibit inter-attribute correlation; (3) these inter-node and inter-attribute correlations can be modeled as the spatial and temporal joint-sparse features, respectively. Secondly, motivated by these observations, we propose two Joint-Sparse Sensory Data Recovery (JSSDR) algorithms to improve the recovery accuracy. Finally, real data-based simulations show that JSSDR outperforms existing solutions. Typically, when the loss rate is less than 65%, JSSDR can estimate missing values with less than 10% error. When the loss rate reaches as high as 80%, the missing values can be estimated by JSSDR with less than 20% error.